This vignette shows how the telprep package can be used to automatically process data from telemetry flights. The package is designed to read in raw txt files, filter out erroneous signals, determine the living status of fish, and create some pretty pictures. This tutorial highlights the general workflow. For a finer-scale description of what each function does, see the help files for the following functions: read_flight_data, channels_merge, replace_date, combine_data, get_date_bins, rm_land_detects, get_best_detections, get_locations, flag_dead_fish, hmm_survival, and make_plot.

Program Workflow:

Step 1 – Read raw detection data into R

Begin by placing the raw txt files in a folder. Txt files from multiple flights can and should be included in the same folder. When naming the txt files, include the flight grouping, the flight date, and the receiver location (belly/wing), e.g. F1_1-26-19_Belly.TXT, as this information will be used later on. Note the path to the folder (my example files are in “D:/Jordy/flight-data/”) and store the coordinate reference system of the txt files (as a proj4string) in the variable crs_in. The function read_flight_data can now be used to read the raw data into R and store it in the variable raw_data:

directory <- "D:/Jordy/flight-data/"
crs_in <- "+proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0"
raw_data <- read_flight_data(directory, crs_in)
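Before reading a folder, it can help to check that every file name carries the pieces of information described above. The following is a minimal base R sketch, not part of telprep: the demo folder, the file "notes.TXT", and the naming patterns ("_f<number>_" for the flight grouping, "belly" or "wing" for the receiver) are assumptions modelled on the example file names in this vignette.

```r
# Hypothetical demo folder with empty files, so the sketch runs anywhere
demo_dir <- file.path(tempdir(), "flight-data-demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_files <- c("tburb_f1_1-26-19-BELLY.TXT",
                "tburb_f1_1-26-19-WING.TXT",
                "notes.TXT")
invisible(file.create(file.path(demo_dir, demo_files)))

# List the txt files (case-insensitive extension match)
txt_files <- list.files(demo_dir, pattern = "\\.txt$", ignore.case = TRUE)

# Assumed conventions: "_f<number>_" marks the flight grouping,
# "belly" or "wing" marks the receiver location
has_group    <- grepl("_f[0-9]+_", txt_files)
has_receiver <- grepl("belly|wing", txt_files, ignore.case = TRUE)

# Files that do not follow the assumed naming convention
bad_names <- txt_files[!(has_group & has_receiver)]
bad_names
```

Any file listed in bad_names would need to be renamed or moved out of the folder before the data are read in.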

Structurally, raw_data is a list of data.frames (there is one for each txt file in the folder). To see how these data.frames are ordered, run

names(raw_data)
##  [1] "tburb_f1_1-26-19-BELLY.TXT"                      
##  [2] "tburb_f1_1-26-19-WING.TXT"                       
##  [3] "tburb_f1_1-27-19-BELLY.TXT"                      
##  [4] "tburb_f1_1-27-19-WING.TXT"                       
##  [5] "tburb_f2_2-6-19-BELLY.TXT"                       
##  [6] "tburb_f2_2-6-19-WING.TXT"                        
##  [7] "tburb_f2_2-7-19-BELLY.TXT"                       
##  [8] "tburb_f2_2-7-19-WING.TXT"                        
##  [9] "tburb_f3_5-19-19-BELLY.TXT"                      
## [10] "tburb_f3_5-19-19-WING.TXT"                       
## [11] "tburb_f3_5-20-19-BELLY.TXT"                      
## [12] "tburb_f3_5-20-19-WING.TXT"                       
## [13] "tburb_f4_July-19-BELLY.TXT"                      
## [14] "tburb_f4_July-19-WING.TXT"                       
## [15] "tburb_f5_10-17-19-BELLY.TXT"                     
## [16] "tburb_f5_10-17-19-WING.TXT"                      
## [17] "tburb_f5_10-7-19-BELLY.TXT"                      
## [18] "tburb_f5_10-7-19-WING.TXT"                       
## [19] "tburb_f6_Dec-19-BELLY.TXT"                       
## [20] "tburb_f6_Dec-19-WING.TXT"                        
## [21] "tburb_f7_Jan-20-AG-1stFL-LEFT ANTENNA-WING.TXT"  
## [22] "tburb_f7_Jan-20-AG-1stFL-RIGHT ANTENNA-BELLY.TXT"
## [23] "tburb_f7_Jan-20-AG-2ndFL-LEFT ANTENNA-WING.TXT"  
## [24] "tburb_f7_Jan-20-AG-2ndFL-RIGHT ANTENNA-BELLY.TXT"
## [25] "tburb_f8_Feb-20-BELLY.TXT"                       
## [26] "tburb_f8_Feb-20-WING.TXT"

The contents of “tburb_f1_1-26-19-WING.TXT” are stored in the 2nd data.frame in the list.
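Because the list is named, a file can also be looked up by name rather than by counting positions. This is a small base R sketch on a toy list; raw_demo and its one-column data.frames are made up for illustration and only mimic the structure of raw_data.

```r
# Toy stand-in for raw_data: a named list of data.frames
raw_demo <- list(
  "tburb_f1_1-26-19-BELLY.TXT" = data.frame(Channel = 35),
  "tburb_f1_1-26-19-WING.TXT"  = data.frame(Channel = 10)
)

# Position of a file in the list
idx <- which(names(raw_demo) == "tburb_f1_1-26-19-WING.TXT")

# The same data.frame, pulled out by name instead of by position
wing <- raw_demo[["tburb_f1_1-26-19-WING.TXT"]]
```

Looking files up by name guards against the index shifting when files are added to the folder.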

Step 2 – If channels or dates were misprogrammed, make corrections

If a channel or date was misprogrammed, it needs to be corrected before the contents of raw_data can be combined into a single data.frame.

Misprogrammed Channels

Suppose that channel 3 was misprogrammed as channel 10 in “tburb_f1_1-26-19-WING.TXT”. To peek at the data, run

head(raw_data[[2]])
##              DateTime Channel TagID Antenna Power       Y        X Status
## 1 2019-01-25 14:10:48      35    80     AH0   120 7203409 749053.1 Active
## 3 2019-01-26 09:30:09      10    42     AH0   111 7192887 747589.6   Mort
## 4 2019-01-26 09:30:21      10    42     AH0   124 7192611 748175.1   Mort
## 5 2019-01-26 09:30:46      10    42     AH0   129 7192046 749351.5   Mort
## 6 2019-01-26 09:30:58      10    42     AH0    97 7191769 749951.0   Mort
## 7 2019-01-26 09:31:10      10    42     AH0    61 7191478 750554.0   Mort

To replace channel 10 with channel 3, the following line of code is run:

raw_data[[2]] <- channels_merge(raw_data[[2]], 10, 3)
head(raw_data[[2]])
##              DateTime Channel TagID Antenna Power       Y        X Status
## 1 2019-01-25 14:10:48      35    80     AH0   120 7203409 749053.1 Active
## 3 2019-01-26 09:30:09       3    42     AH0   111 7192887 747589.6   Mort
## 4 2019-01-26 09:30:21       3    42     AH0   124 7192611 748175.1   Mort
## 5 2019-01-26 09:30:46       3    42     AH0   129 7192046 749351.5   Mort
## 6 2019-01-26 09:30:58       3    42     AH0    97 7191769 749951.0   Mort
## 7 2019-01-26 09:31:10       3    42     AH0    61 7191478 750554.0   Mort
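
For readers who want to see the kind of recoding involved, here is a base R sketch on a toy data.frame. It illustrates the idea of replacing one channel number with another and is not the implementation of channels_merge; the data.frame flight is made up from a few of the rows shown above.

```r
# Toy detections: channel 10 should have been channel 3
flight <- data.frame(
  Channel = c(35, 10, 10, 10),
  TagID   = c(80, 42, 42, 42),
  Power   = c(120, 111, 124, 129)
)

# Recode every detection on channel 10 to channel 3
flight$Channel[flight$Channel == 10] <- 3

# Tabulating the channels is a quick way to confirm the correction
table(flight$Channel)
```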

Misprogrammed Dates

Suppose that a date was misprogrammed in the file “tburb_f5_10-17-19-WING.TXT”. To peek at the data, run

head(raw_data[[16]])
##               DateTime Channel TagID Antenna Power       Y        X Status
## 1  2003-04-25 08:14:11      40    30      A0    49 7203430 749046.0 Active
## 3  2003-04-25 08:14:21      40    26      A0    65 7203428 749045.4 Active
## 5  2003-04-25 08:14:23      40    30      A0    52 7203428 749045.6 Active
## 7  2003-04-25 08:14:24      40    25      A0    49 7203428 749045.7 Active
## 11 2003-04-25 08:14:32      40    57      A0    49 7203427 749045.6   Mort
## 12 2003-04-25 08:14:33      40    31      A0    69 7203427 749045.7 Active

If the correct flight date was “10/17/19”, the following line of code will make the correction:

raw_data[[16]] <- replace_date(raw_data[[16]], new_date = "10/17/19")
head(raw_data[[16]])
##               DateTime Channel TagID Antenna Power       Y        X Status
## 1  2019-10-17 08:14:11      40    30      A0    49 7203430 749046.0 Active
## 3  2019-10-17 08:14:21      40    26      A0    65 7203428 749045.4 Active
## 5  2019-10-17 08:14:23      40    30      A0    52 7203428 749045.6 Active
## 7  2019-10-17 08:14:24      40    25      A0    49 7203428 749045.7 Active
## 11 2019-10-17 08:14:32      40    57      A0    49 7203427 749045.6   Mort
## 12 2019-10-17 08:14:33      40    31      A0    69 7203427 749045.7 Active
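The correction above keeps the time of day and swaps only the calendar date. A base R sketch of that idea follows; it is an illustration under the assumption that all detections in the file share one flight date, and it is not the implementation of replace_date.

```r
# Two timestamps recorded with the wrong date
dt <- as.POSIXct(c("2003-04-25 08:14:11", "2003-04-25 08:14:21"), tz = "UTC")

# Correct flight date, written in the same "mm/dd/yy" style used above
new_date <- as.Date("10/17/19", format = "%m/%d/%y")

# Keep the time of day, replace the date
fixed <- as.POSIXct(paste(new_date, format(dt, "%H:%M:%S")), tz = "UTC")
format(fixed, "%Y-%m-%d %H:%M:%S")
```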

Step 3 – Combine the raw data between receivers and across flights

The function combine_data combines all of the data stored in raw_data into a single data.frame.

An argument (source_vec) is provided so that the source of each txt file can be specified. The argument can, for instance, be used to specify whether a receiver was located on the belly or the wing of the aircraft. Because 26 txt files were contained in the example folder, a vector of length 26 is used to encode this information:

names(raw_data)
##  [1] "tburb_f1_1-26-19-BELLY.TXT"                      
##  [2] "tburb_f1_1-26-19-WING.TXT"                       
##  [3] "tburb_f1_1-27-19-BELLY.TXT"                      
##  [4] "tburb_f1_1-27-19-WING.TXT"                       
##  [5] "tburb_f2_2-6-19-BELLY.TXT"                       
##  [6] "tburb_f2_2-6-19-WING.TXT"                        
##  [7] "tburb_f2_2-7-19-BELLY.TXT"                       
##  [8] "tburb_f2_2-7-19-WING.TXT"                        
##  [9] "tburb_f3_5-19-19-BELLY.TXT"                      
## [10] "tburb_f3_5-19-19-WING.TXT"                       
## [11] "tburb_f3_5-20-19-BELLY.TXT"                      
## [12] "tburb_f3_5-20-19-WING.TXT"                       
## [13] "tburb_f4_July-19-BELLY.TXT"                      
## [14] "tburb_f4_July-19-WING.TXT"                       
## [15] "tburb_f5_10-17-19-BELLY.TXT"                     
## [16] "tburb_f5_10-17-19-WING.TXT"                      
## [17] "tburb_f5_10-7-19-BELLY.TXT"                      
## [18] "tburb_f5_10-7-19-WING.TXT"                       
## [19] "tburb_f6_Dec-19-BELLY.TXT"                       
## [20] "tburb_f6_Dec-19-WING.TXT"                        
## [21] "tburb_f7_Jan-20-AG-1stFL-LEFT ANTENNA-WING.TXT"  
## [22] "tburb_f7_Jan-20-AG-1stFL-RIGHT ANTENNA-BELLY.TXT"
## [23] "tburb_f7_Jan-20-AG-2ndFL-LEFT ANTENNA-WING.TXT"  
## [24] "tburb_f7_Jan-20-AG-2ndFL-RIGHT ANTENNA-BELLY.TXT"
## [25] "tburb_f8_Feb-20-BELLY.TXT"                       
## [26] "tburb_f8_Feb-20-WING.TXT"
source_vec <- c(rep(c("belly","wing"), 10), rep(c("wing","belly"),2), c("belly","wing"))
source_vec
##  [1] "belly" "wing"  "belly" "wing"  "belly" "wing"  "belly" "wing"  "belly"
## [10] "wing"  "belly" "wing"  "belly" "wing"  "belly" "wing"  "belly" "wing" 
## [19] "belly" "wing"  "wing"  "belly" "wing"  "belly" "belly" "wing"
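When the receiver location is part of every file name, the same vector can be derived from the names instead of being typed by hand. This is a base R sketch that assumes each name contains either BELLY or WING, as the example names do; file_names simply repeats the output of names(raw_data).

```r
# File names copied from names(raw_data)
file_names <- c(
  "tburb_f1_1-26-19-BELLY.TXT", "tburb_f1_1-26-19-WING.TXT",
  "tburb_f1_1-27-19-BELLY.TXT", "tburb_f1_1-27-19-WING.TXT",
  "tburb_f2_2-6-19-BELLY.TXT", "tburb_f2_2-6-19-WING.TXT",
  "tburb_f2_2-7-19-BELLY.TXT", "tburb_f2_2-7-19-WING.TXT",
  "tburb_f3_5-19-19-BELLY.TXT", "tburb_f3_5-19-19-WING.TXT",
  "tburb_f3_5-20-19-BELLY.TXT", "tburb_f3_5-20-19-WING.TXT",
  "tburb_f4_July-19-BELLY.TXT", "tburb_f4_July-19-WING.TXT",
  "tburb_f5_10-17-19-BELLY.TXT", "tburb_f5_10-17-19-WING.TXT",
  "tburb_f5_10-7-19-BELLY.TXT", "tburb_f5_10-7-19-WING.TXT",
  "tburb_f6_Dec-19-BELLY.TXT", "tburb_f6_Dec-19-WING.TXT",
  "tburb_f7_Jan-20-AG-1stFL-LEFT ANTENNA-WING.TXT",
  "tburb_f7_Jan-20-AG-1stFL-RIGHT ANTENNA-BELLY.TXT",
  "tburb_f7_Jan-20-AG-2ndFL-LEFT ANTENNA-WING.TXT",
  "tburb_f7_Jan-20-AG-2ndFL-RIGHT ANTENNA-BELLY.TXT",
  "tburb_f8_Feb-20-BELLY.TXT", "tburb_f8_Feb-20-WING.TXT"
)

# "belly" where the name contains BELLY, otherwise "wing"
source_from_names <- ifelse(grepl("BELLY", file_names), "belly", "wing")
```

Deriving the vector this way avoids mistakes when the belly/wing order changes partway through the folder, as it does for files 21 to 24.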

Now that the source of the data has been specified, the function combine_data can be used to combine the contents of raw_data:

all_data <- combine_data(raw_data, source_vec)
head(all_data)
##                 DateTime Channel TagID Antenna Power       Y        X Status
## 1100 2019-01-25 14:10:48      35    80     AH0   120 7203409 749053.1 Active
## 1    2019-01-26 09:15:01      63    19     AH0    43 7195053 742058.9 Active
## 2    2019-01-26 09:15:03      63    74     AH0    42 7195024 742140.0 Active
## 8    2019-01-26 09:15:52      63     3     AH0    46 7194144 744417.7 Active
## 16   2019-01-26 09:16:53      10    42     AH0   120 7192957 747439.5   Mort
## 19   2019-01-26 09:17:17      10    42     AH0   119 7192399 748614.9   Mort
##      Source
## 1100   wing
## 1     belly
## 2     belly
## 8     belly
## 16    belly
## 19    belly
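Conceptually, this step labels each data.frame with its source, stacks the data.frames, and sorts the result by time. The base R sketch below shows that idea on toy data; it is not the implementation of combine_data, and the objects flights and src are made up.

```r
# Two toy flight files with identical columns
flights <- list(
  belly = data.frame(
    DateTime = as.POSIXct(c("2019-01-26 09:15:01", "2019-01-26 09:16:53"), tz = "UTC"),
    Channel  = c(63, 10),
    TagID    = c(19, 42)
  ),
  wing = data.frame(
    DateTime = as.POSIXct("2019-01-25 14:10:48", tz = "UTC"),
    Channel  = 35,
    TagID    = 80
  )
)
src <- c("belly", "wing")

# Add a Source column to each data.frame, then stack them
tagged   <- Map(function(df, s) { df$Source <- s; df }, flights, src)
combined <- do.call(rbind, tagged)

# Sort by detection time and reset the row names
combined <- combined[order(combined$DateTime), ]
rownames(combined) <- NULL
combined
```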

Step 4 – Import geographic data

In order to use the telprep package, a SpatialLinesDataFrame representation of the river system must be imported into R. The following lines of code can be used to import a shapefile (named example.shp) as a SpatialLinesDataFrame object using the readOGR function from the rgdal package:

setwd("D:/Jordy/telprep/telprep/data/sf")
sldf <- rgdal::readOGR("example.shp")
## OGR data source with driver: ESRI Shapefile 
## Source: "D:\Jordy\telprep\telprep\data\sf\example.shp", layer: "example"
## with 1 features
## It has 1 fields

The coordinate reference system of the geographic data must match that of the detection data. If the coordinate reference systems do not match, the following line of code will convert the coordinate reference system of the geographic data to that of the detection data.

sldf <- sp::spTransform(sldf, attr(all_data, "crs"))

Step 5 – Remove false signals and determine the location of each fish during a set of detection periods

The riverdist package is used internally for calculations related to river proximity. To use the functionality of this package, sldf (from Step 4) must be converted into a river_network object. The riverdist function line2network can be used to make the conversion (the user is referred to the riverdist package for help).

river_net <- riverdist::line2network(sp=sldf, tolerance = 500)
## 
##  Units: m 
## 
##  Removed 1 duplicate segments. 
## 
##  Removed 90 segments with lengths shorter than the connectivity tolerance.

The function rm_land_detects can be used to discard the detections that occurred away from the river system. To remove the detections that occurred more than 500 m away from a river channel and store the data in a variable called river_detects, the following line of code is run:

river_detects <- rm_land_detects(all_data, river_net, dist_thresh = 500)
## [1] "be patient -- this could take a few minutes"
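The idea behind the distance threshold can be illustrated with a simplified base R sketch. It measures the distance from each detection to the nearest river vertex, which is cruder than the river proximity calculations that riverdist performs, so treat it as an illustration of the threshold only. The coordinates and the objects river_xy and detects are made up, and units are assumed to be metres.

```r
# Toy river described by three vertices (projected coordinates in metres)
river_xy <- cbind(x = c(0, 1000, 2000), y = c(0, 0, 0))

# Toy detections
detects <- data.frame(
  TagID = c(1, 2, 3),
  X     = c(100, 1000, 900),
  Y     = c(200, 450, 900)
)

# Distance from each detection to its nearest river vertex
nearest_m <- vapply(seq_len(nrow(detects)), function(i) {
  min(sqrt((river_xy[, "x"] - detects$X[i])^2 +
           (river_xy[, "y"] - detects$Y[i])^2))
}, numeric(1))

# Keep detections within 500 m of the river
dist_thresh <- 500
kept <- detects[nearest_m <= dist_thresh, ]
kept
```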

The function get_best_locations can be used to determine the best location for each fish in each detection period. In short, the best location is considered to be the location where the highest power detection occurred during each flight period. The date_bins argument of the get_best_locations function specifies the start and end dates of the detection periods. These dates can be found and formatted using the get_date_bins function:

names(raw_data)
##  [1] "tburb_f1_1-26-19-BELLY.TXT"                      
##  [2] "tburb_f1_1-26-19-WING.TXT"                       
##  [3] "tburb_f1_1-27-19-BELLY.TXT"                      
##  [4] "tburb_f1_1-27-19-WING.TXT"                       
##  [5] "tburb_f2_2-6-19-BELLY.TXT"                       
##  [6] "tburb_f2_2-6-19-WING.TXT"                        
##  [7] "tburb_f2_2-7-19-BELLY.TXT"                       
##  [8] "tburb_f2_2-7-19-WING.TXT"                        
##  [9] "tburb_f3_5-19-19-BELLY.TXT"                      
## [10] "tburb_f3_5-19-19-WING.TXT"                       
## [11] "tburb_f3_5-20-19-BELLY.TXT"                      
## [12] "tburb_f3_5-20-19-WING.TXT"                       
## [13] "tburb_f4_July-19-BELLY.TXT"                      
## [14] "tburb_f4_July-19-WING.TXT"                       
## [15] "tburb_f5_10-17-19-BELLY.TXT"                     
## [16] "tburb_f5_10-17-19-WING.TXT"                      
## [17] "tburb_f5_10-7-19-BELLY.TXT"                      
## [18] "tburb_f5_10-7-19-WING.TXT"                       
## [19] "tburb_f6_Dec-19-BELLY.TXT"                       
## [20] "tburb_f6_Dec-19-WING.TXT"                        
## [21] "tburb_f7_Jan-20-AG-1stFL-LEFT ANTENNA-WING.TXT"  
## [22] "tburb_f7_Jan-20-AG-1stFL-RIGHT ANTENNA-BELLY.TXT"
## [23] "tburb_f7_Jan-20-AG-2ndFL-LEFT ANTENNA-WING.TXT"  
## [24] "tburb_f7_Jan-20-AG-2ndFL-RIGHT ANTENNA-BELLY.TXT"
## [25] "tburb_f8_Feb-20-BELLY.TXT"                       
## [26] "tburb_f8_Feb-20-WING.TXT"
flight_group <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,5,5,5,5,6,6,7,7,7,7,8,8)
date_bins <- get_date_bins(raw_data, flight_group)
date_bins
##      [,1]       [,2]      
## [1,] "01/25/19" "01/27/19"
## [2,] "01/28/19" "02/07/19"
## [3,] "05/19/19" "05/24/19"
## [4,] "07/24/19" "07/27/19"
## [5,] "08/05/19" "10/17/19"
## [6,] "12/16/19" "12/18/19"
## [7,] "01/15/20" "01/21/20"
## [8,] "02/05/20" "02/08/20"
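If the flight grouping is encoded in the file names, the flight_group vector can be extracted with a regular expression rather than typed by hand. This base R sketch assumes the "_f<number>_" pattern seen in the example names; some_names is a subset of names(raw_data) chosen for brevity.

```r
# A few of the example file names
some_names <- c(
  "tburb_f1_1-26-19-BELLY.TXT",
  "tburb_f1_1-27-19-WING.TXT",
  "tburb_f4_July-19-BELLY.TXT",
  "tburb_f7_Jan-20-AG-1stFL-LEFT ANTENNA-WING.TXT",
  "tburb_f8_Feb-20-WING.TXT"
)

# Capture the digits between "_f" and the next underscore
group_from_names <- as.integer(sub("^.*_f([0-9]+)_.*$", "\\1", some_names))
group_from_names
```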

After the detection periods have been specified, the function get_best_locations can be used to determine the locations of the fish:

best_locations <- get_best_locations(river_detects, date_bins = date_bins, bin_by = NA, n_thresh = 5, dist_max = 5000, remove_flagged = FALSE)
head(best_locations$all_detects)
##                  DateTime Channel TagID Antenna Power       Y        X Status
## 1     2019-01-26 09:15:01      63    19     AH0    43 7195053 742058.9 Active
## 2     2019-01-26 09:15:03      63    74     AH0    42 7195024 742140.0 Active
## 107   2019-01-26 09:44:04      82     2     AH0    50 7144580 833903.8 Active
## 11    2019-01-26 09:46:04       3    47     AH0   105 7164242 795278.5   Mort
## 11100 2019-01-26 09:46:04      10    47     AH0   105 7164242 795278.5   Mort
## 1610  2019-01-26 09:46:37       3    47     AH0   132 7165018 797033.3   Mort
##       Source BestSignal FlightNum    Dist Records
## 1      belly       TRUE         1   0.000       1
## 2      belly      FALSE         1 327.404       2
## 107    belly      FALSE         1 289.790       3
## 11      wing      FALSE         1   2.877       7
## 11100   wing      FALSE         1   3.414      10
## 1610    wing      FALSE         1   1.425       7
head(best_locations$best_detects)
##                DateTime Channel TagID Power       Y         X Status Source
## 1   2019-01-26 09:15:01      63    19    43 7195053  742058.9 Active  belly
## 134 2019-01-26 09:48:45      82    10    52 7136212  849867.7 Active  belly
## 56  2019-01-26 11:01:53      35    85   162 7039969 1059337.4 Active   wing
## 192 2019-01-26 11:03:04      10    83   183 7030325 1072930.3 Active  belly
## 193 2019-01-26 11:03:05      22    84   121 7030359 1072892.3 Active  belly
## 202 2019-01-26 11:05:36      34    82   157 7034868 1066905.4 Active  belly
##     FlightNum Records  flag
## 1           1       1  TRUE
## 134         1       1  TRUE
## 56          1      11 FALSE
## 192         1      13 FALSE
## 193         1       1  TRUE
## 202         1       1  TRUE

best_locations$all_detects adds some useful columns to all_data:
- BestSignal indicates whether the detection is the highest power signal for that fish in the detection period
- Dist is the Euclidean distance (in km) between the detection location and the associated highest power detection
- FlightNum is the detection period
- Records is the number of times that a fish was detected in the detection period

best_locations$best_detects contains the highest power detections only:
- These detections are flagged if there are fewer than n_thresh detections within a distance of dist_max km from the best detection during the detection period.
- A detection is also flagged if a positive linear relationship exists between Power and Dist for all detections within dist_max km of the best signal in the detection period (i.e. the signal strength increases with distance from the best detection instead of increasing as the best detection is approached).
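
The two flagging rules can be sketched in base R for a single fish in a single detection period. This is one reading of the description above and not the code inside get_best_locations: the toy data, the choice to count the best detection itself toward n_thresh, and the use of lm to test the sign of the Power and Dist relationship are all assumptions.

```r
# Toy detections of one fish in one detection period (coordinates in metres)
detects <- data.frame(
  Power = c(130, 110, 90, 70, 50, 40),
  X     = c(0, 1000, 2000, 3000, 4000, 9000),
  Y     = 0
)
n_thresh <- 5
dist_max <- 5   # km

# Best detection = highest power; Dist = distance to it in km
best <- detects[which.max(detects$Power), ]
detects$Dist <- sqrt((detects$X - best$X)^2 + (detects$Y - best$Y)^2) / 1000

# Rule 1: too few detections near the best detection
near    <- detects[detects$Dist <= dist_max, ]
too_few <- nrow(near) < n_thresh

# Rule 2: power rises with distance from the best detection
slope    <- coef(lm(Power ~ Dist, data = near))[["Dist"]]
positive <- slope > 0

flag <- too_few || positive
flag
```

In this toy case five detections fall within dist_max and power drops with distance, so the best detection is not flagged.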

Step 6 – Determine the living status of the fish

Two functions (flag_dead_fish and hmm_survival) are provided to help determine the living status of fish.

flag_dead_fish uses locational information to determine which fish have expired. If a fish moves less than dist_thresh km in every consecutive detection period following a detection, the fish is flagged as dead. The following lines of code flag dead fish using this approach:

best_detects <- best_locations$best_detects
head(best_detects)
##                DateTime Channel TagID Power       Y         X Status Source
## 1   2019-01-26 09:15:01      63    19    43 7195053  742058.9 Active  belly
## 134 2019-01-26 09:48:45      82    10    52 7136212  849867.7 Active  belly
## 56  2019-01-26 11:01:53      35    85   162 7039969 1059337.4 Active   wing
## 192 2019-01-26 11:03:04      10    83   183 7030325 1072930.3 Active  belly
## 193 2019-01-26 11:03:05      22    84   121 7030359 1072892.3 Active  belly
## 202 2019-01-26 11:05:36      34    82   157 7034868 1066905.4 Active  belly
##     FlightNum Records  flag
## 1           1       1  TRUE
## 134         1       1  TRUE
## 56          1      11 FALSE
## 192         1      13 FALSE
## 193         1       1  TRUE
## 202         1       1  TRUE
# keep only the best detections that were not flagged in Step 5
best_detects <- best_detects[best_detects$flag == FALSE, ]
flagged_fish <- flag_dead_fish(best_detects, dist_thresh = 0.5)
head(flagged_fish)
##                 DateTime Channel TagID Power       Y       X Status Source
## 56   2019-01-26 11:01:53      35    85   162 7039969 1059337 Active   wing
## 192  2019-01-26 11:03:04      10    83   183 7030325 1072930 Active  belly
## 69   2019-01-26 11:14:41      35    84   152 7026735 1076430   Mort   wing
## 74   2019-01-26 11:15:03      35    83   169 7027807 1075850 Active   wing
## 91   2019-01-26 11:16:30       3    83   155 7030738 1072477 Active   wing
## 1021 2019-01-26 11:17:54       3    82   171 7033727 1069361 Active   wing
##      FlightNum Records  flag MoveDist MortFlag
## 56           1      11 FALSE       NA       No
## 192          1      13 FALSE       NA       No
## 69           1      10 FALSE       NA       No
## 74           1      12 FALSE       NA       No
## 91           1      11 FALSE       NA       No
## 1021         1       7 FALSE       NA       No
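The movement rule can be sketched in base R for one fish. This follows the description above under stated assumptions and is not the implementation of flag_dead_fish: the toy track, the logical column name, and the treatment of the final detection (never flagged, because no later movement exists) are made up for illustration.

```r
# Toy track: one best location per detection period (coordinates in metres)
track <- data.frame(
  FlightNum = 1:5,
  X = c(0, 3000, 3100, 3150, 3200),
  Y = 0
)
dist_thresh <- 0.5   # km

# Distance moved since the previous detection period, in km
move_km <- c(NA, sqrt(diff(track$X)^2 + diff(track$Y)^2) / 1000)

# Flag a detection if every later movement is below the threshold
n <- nrow(track)
track$stationary_after <- vapply(seq_len(n), function(i) {
  i < n && all(move_km[(i + 1):n] < dist_thresh)
}, logical(1))
track
```

Here the fish moves 3 km between periods 1 and 2 and then barely moves, so periods 2 to 4 are flagged.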

hmm_survival operates similarly to flag_dead_fish; however, it uses a more sophisticated method to determine the living status of the fish. Briefly, the function uses locational and mortality sensor information to determine the most likely sequence of survival states (called the Viterbi path) for each fish using a Hidden Markov Model (HMM). A benefit of the HMM-based approach is that detection probabilities and survival rates are estimated statistically. A detailed description of the HMM can be found by running vignette("hmm") in the console.

The following lines of code will fit the HMM to determine the living status of the fish:

library(msm)
hmm_out <- hmm_survival(best_detects)
hmm_out$results
## [[1]]
##                        estimate     lower     upper
## annual survival rate  0.8596032 0.8184353 0.8928571
## annual mortality rate 0.1403968 0.1071429 0.1815647
## 
## [[2]]
##                                     estimate     lower     upper
## detection probability live fish    0.2995275 0.2459589 0.3592053
## detection probability expired fish 0.9256420 0.6319511 0.9890412
## 
## [[3]]
##      [,1]                                                                              
## [1,] "the mortality signals work for live fish approximately 47 percent of the time"   
## [2,] "the mortality signals work for expired fish approximately 97 percent of the time"
head(hmm_out$viterbi)
##                 DateTime Channel TagID Power       Y       X Status Source
## 56   2019-01-26 11:01:53      35    85   162 7039969 1059337 Active   wing
## 192  2019-01-26 11:03:04      10    83   183 7030325 1072930 Active  belly
## 69   2019-01-26 11:14:41      35    84   152 7026735 1076430   Mort   wing
## 74   2019-01-26 11:15:03      35    83   169 7027807 1075850 Active   wing
## 91   2019-01-26 11:16:30       3    83   155 7030738 1072477 Active   wing
## 1021 2019-01-26 11:17:54       3    82   171 7033727 1069361 Active   wing
##      FlightNum Records  flag Viterbi
## 56           1      11 FALSE       1
## 192          1      13 FALSE       1
## 69           1      10 FALSE       1
## 74           1      12 FALSE       1
## 91           1      11 FALSE       1
## 1021         1       7 FALSE       1

In the column hmm_out$viterbi$Viterbi, a value of 1 corresponds to the event that the fish is alive, whereas a value of 2 corresponds to the event that the fish has expired.
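
To make the notion of a Viterbi path concrete, here is a generic two-state Viterbi decoder in base R. It is a textbook sketch and not the model fitted by hmm_survival: the function viterbi_path, the three observation codes, and every probability below are made-up values chosen for illustration.

```r
# Generic Viterbi decoder working in log space
viterbi_path <- function(obs, init, trans, emis) {
  n_obs <- length(obs)
  n_states <- length(init)
  logp <- matrix(-Inf, n_obs, n_states)   # best log-probability so far
  back <- matrix(0L, n_obs, n_states)     # best previous state
  logp[1, ] <- log(init) + log(emis[, obs[1]])
  for (t in seq_len(n_obs)[-1]) {
    for (j in seq_len(n_states)) {
      cand <- logp[t - 1, ] + log(trans[, j])
      back[t, j] <- which.max(cand)
      logp[t, j] <- max(cand) + log(emis[j, obs[t]])
    }
  }
  # Trace the most likely state sequence backwards
  path <- integer(n_obs)
  path[n_obs] <- which.max(logp[n_obs, ])
  for (t in rev(seq_len(n_obs - 1))) path[t] <- back[t + 1, path[t + 1]]
  path
}

# States: 1 = alive, 2 = expired (expired is absorbing)
init  <- c(1, 0)
trans <- matrix(c(0.9, 0.1,
                  0.0, 1.0), nrow = 2, byrow = TRUE)
# Observation codes: 1 = not detected, 2 = active signal, 3 = mortality signal
emis  <- matrix(c(0.60, 0.30, 0.10,
                  0.10, 0.05, 0.85), nrow = 2, byrow = TRUE)

obs <- c(2, 1, 3, 3, 3)
viterbi_path(obs, init, trans, emis)   # 1 1 2 2 2
```

With these made-up numbers, a single mortality signal would not be enough to call the fish expired, but three in a row make the expired state the most likely explanation from the third period on.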

Step 7 – Create some pretty pictures

A plotting function (make_plot) is included in the telprep package. This function is designed to be used throughout the analysis. Examples of how the function can be used are provided here:

# basic plot
par(mfrow=c(1,1))
make_plot(sldf, best_detects)

# darken background
make_plot(sldf, best_detects, darken=2.5)

# change style of background
make_plot(sldf, best_detects, type="esri-topo")

# give each fish a unique color preserved through flights
par(mfrow=c(3,1))
make_plot(sldf, best_detects, col_by_fish=TRUE, flight=1, darken=2.5)
make_plot(sldf, best_detects, col_by_fish=TRUE, flight=2, darken=2.5)
make_plot(sldf, best_detects, col_by_fish=TRUE, flight=3, darken=2.5)

# to plot the locations for a single fish
par(mfrow=c(1,1))
make_plot(sldf, best_detects, channel=10, tag_id=11, darken=2.5)

# to zoom in to a specified extent
extent <- c(x_min=466060, x_max=1174579, y_min=6835662, y_max=7499016)
make_plot(sldf, best_detects, extent, darken=2.5)

# plotting live and dead fish by flight period -- green fish have expired
viterbi <- hmm_out$viterbi  # detections with the Viterbi column from Step 6
par(mfrow=c(3,1))
make_plot(sldf, viterbi, type="bing", darken=2.5, viterbi=TRUE, flight=1)
make_plot(sldf, viterbi, type="bing", darken=2.5, viterbi=TRUE, flight=3)
make_plot(sldf, viterbi, type="bing", darken=2.5, viterbi=TRUE, flight=5)